Smartsheet-JB-Brown commented Mar 11, 2025

Bedrock Cost Calculations with Intelligent Prompt Routing Support

Overview

This PR adds support for accurate cost calculations for AWS Bedrock models, with special handling for intelligent prompt routing. The implementation enables precise tracking of costs when using AWS Bedrock's intelligent prompt router feature, which dynamically selects the most appropriate model for each request.

Changes

New Features

  • Dedicated Cost Calculation Module: Created a new cost.ts file that centralizes all cost calculation logic for different providers
  • Intelligent Prompt Router Support: Added support for extracting model information from ARNs returned by intelligent prompt routers
  • Detailed Model Pricing: Implemented comprehensive pricing data for all Bedrock models, including:
    • Claude 3 family (Opus, Sonnet, Haiku)
    • Claude 3.5 family (Sonnet, Haiku)
    • Claude 3.7 Sonnet
    • Llama 3, 3.2, and 3.3 models
    • Amazon Titan models
    • Amazon Nova models
  • Cache Cost Tracking: Added support for calculating costs associated with prompt cache operations (write and read)
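
The features above can be illustrated with a minimal sketch of a per-model pricing table and a cost function that bills cache writes and reads separately from regular input and output tokens. This is not the code from cost.ts: the names `ModelPricing`, `TokenUsage`, `EXAMPLE_PRICING`, and `calculateCost` are hypothetical, the model keys are made up, and the prices are placeholder figures rather than actual AWS Bedrock rates.

```typescript
// Hypothetical pricing shape: USD per one million tokens.
interface ModelPricing {
	inputPrice: number
	outputPrice: number
	cacheWritesPrice?: number // omitted when the model has no prompt cache pricing
	cacheReadsPrice?: number
}

// Token counts reported for a single request (assumed field names).
interface TokenUsage {
	inputTokens: number
	outputTokens: number
	cacheWriteTokens?: number
	cacheReadTokens?: number
}

// Placeholder figures for illustration only; NOT actual AWS Bedrock prices.
const EXAMPLE_PRICING: Record<string, ModelPricing> = {
	"example-large-model": { inputPrice: 3.0, outputPrice: 15.0, cacheWritesPrice: 3.75, cacheReadsPrice: 0.3 },
	"example-small-model": { inputPrice: 0.25, outputPrice: 1.25 },
}

// Returns the request cost in USD, or undefined when the model has no pricing entry.
function calculateCost(modelId: string, usage: TokenUsage): number | undefined {
	const pricing = EXAMPLE_PRICING[modelId]
	if (!pricing) {
		return undefined
	}
	// Convert a per-million-token price and a token count into USD.
	const cost = (pricePerMillion: number | undefined, tokens: number | undefined): number =>
		((pricePerMillion ?? 0) * (tokens ?? 0)) / 1_000_000

	return (
		cost(pricing.inputPrice, usage.inputTokens) +
		cost(pricing.outputPrice, usage.outputTokens) +
		cost(pricing.cacheWritesPrice, usage.cacheWriteTokens) +
		cost(pricing.cacheReadsPrice, usage.cacheReadTokens)
	)
}
```

Returning `undefined` for an unknown model, rather than 0, is one possible design choice: it lets the caller distinguish "free" from "price not known".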

Technical Improvements

  • Enhanced Stream Interface: Updated the ApiStreamUsageChunk interface to include the invokedModelId field for tracking which model was actually used
  • ARN Parsing: Added functionality to extract model information from complex ARN formats
  • Comprehensive Test Coverage: Added extensive tests for cost calculations across different models and scenarios
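
As a rough sketch of why the `invokedModelId` field matters: when a prompt router is configured, the model that served the request can differ from the configured identifier, so cost lookup should prefer the invoked one. Only `invokedModelId` comes from this PR; the interface name `UsageChunkSketch`, its other fields, and the helper `pickModelIdentifier` are assumptions for illustration and may not match stream.ts.

```typescript
// Simplified stand-in for a usage chunk; only invokedModelId is taken from the PR.
interface UsageChunkSketch {
	type: "usage"
	inputTokens: number
	outputTokens: number
	// Set when the provider reports which model actually handled the request.
	invokedModelId?: string
}

// Prefer the model that actually served the request; fall back to the configured one.
// The returned value may still be an ARN that needs parsing before a pricing lookup.
function pickModelIdentifier(configuredModelId: string, chunk: UsageChunkSketch): string {
	return chunk.invokedModelId ?? configuredModelId
}
```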

Why These Changes Are Needed

  1. Accurate Cost Tracking: As AWS Bedrock usage grows, accurate cost tracking becomes essential for budgeting and optimization
  2. Intelligent Prompt Router Support: The intelligent prompt router feature dynamically selects models, making it difficult to track costs without this enhancement
  3. Detailed Model Support: AWS continues to add new models with different pricing structures, requiring a more comprehensive approach to cost calculation
  4. Cache Cost Accounting: Prompt caching has different pricing than standard requests and needs to be accounted for separately

Testing

All tests pass, and type checking completes with no errors. The tests include:

  • Unit tests for cost calculations with various models
  • Tests for intelligent prompt router ARN parsing
  • Tests for cache cost calculations

Future Considerations

  • As AWS adds new models or changes pricing, the cost calculation module can be easily updated
  • This pattern could be extended to other providers that offer similar dynamic model selection features

Important

Adds cost calculation module for AWS Bedrock models with intelligent prompt routing and updates interfaces and tests for accurate cost tracking.

  • Cost Calculation:
    • New cost.ts module for centralized cost calculation logic.
    • Supports AWS Bedrock models with intelligent prompt routing.
    • Implements detailed pricing for Claude, Llama, Amazon Titan, and Nova models.
    • Handles cache write and read cost calculations.
  • Interface Updates:
    • ApiStreamUsageChunk in stream.ts now includes invokedModelId.
  • Testing:
    • Added tests in cost.test.ts for various models and scenarios.
    • Tests cover intelligent prompt router ARN parsing and cache cost calculations.
  • Misc:
    • Enhanced error handling in bedrock.ts for custom ARN issues.

This description was created by Ellipsis for 886e5db. It will automatically update as commits are pushed.

…into jbbrown/bedrock_cost_calculations

* jbbrown/aws_custom_arn_for_intelligent_prompt_routing:
  Bedrock specific cost calculation, including support for intelligent prompt routing

changeset-bot bot commented Mar 11, 2025

⚠️ No Changeset found

Latest commit: 886e5db

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


dosubot bot added the size:XL (This PR changes 500-999 lines, ignoring generated files) and enhancement (New feature or request) labels Mar 11, 2025

if (invokedModelId) {
	// Extract the model name from the ARN
	// Example ARN: arn:aws:bedrock:us-west-2:699475926481:inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0
	const modelMatch = invokedModelId.match(/\/([^\/]+)(?::|$)/)

The ARN model extraction logic is duplicated here and in cost.ts. Consider extracting this into a shared utility function to avoid code duplication and ensure consistency.
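
One way to act on this suggestion is a single helper that both call sites import. The sketch below reuses the regular expression from the snippet above; the function name `extractModelIdFromArn` and the fallback of returning the input unchanged when no `/` is present are assumptions, not code from this PR.

```typescript
// Hypothetical shared helper; the name and fallback behavior are assumptions.
// Returns the segment after the last "/" in an ARN, or the input unchanged
// when it contains no "/" (for example, a plain model ID).
function extractModelIdFromArn(arnOrModelId: string): string {
	// Same pattern as the PR snippet: capture the text after a "/" up to the end.
	const match = arnOrModelId.match(/\/([^\/]+)(?::|$)/)
	return match?.[1] ?? arnOrModelId
}

const exampleArn =
	"arn:aws:bedrock:us-west-2:699475926481:inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0"
console.log(extractModelIdFromArn(exampleArn))
```

Because the capture group is greedy, the version suffix after the final colon stays part of the returned identifier, and any regional prefix such as `us.` is kept as well, so a pricing lookup would need keys in that form or a further normalization step.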

| "openai-native"
| "vertex"
| "unbound"
| "glama"

There is a potential typographical error on line 8 in the provider union where 'glama' is used. Considering that elsewhere in the code (in model costs) the term 'llama' is used, please verify if 'glama' was intended or if it should be corrected to 'llama'.

Suggested change
- | "glama"
+ | "llama"
